
feat(safety): add opt-in tool script safety guard with manifest validation #121

Open

YAO-001 wants to merge 12 commits into trpc-group:main from YAO-001:codex/tool-script-safety-guard
YAO-001 commented Jul 4, 2026


English

Summary

Addresses #90.

This PR adds an opt-in Tool Script Safety Guard for pre-execution scanning of Python and Bash-like tool scripts.

It provides static scanning, policy-based decisions, structured safety reports, sanitized audit events, OpenTelemetry-compatible attributes, and opt-in integration points for Filter/Wrapper, BashTool, and UnsafeLocalCodeExecutor.

This guard is a static pre-execution control. It is not a sandbox and does not replace process isolation, least-privilege filesystem permissions, network egress controls, resource limits, or runtime audit/monitoring.

What Changed

  • Added trpc_agent_sdk.tools.safety with ToolScriptSafetyScanner, ToolSafetyPolicy, ToolSafetyFilter, ToolSafetyWrapper, custom safety rule registration APIs, and structured report, audit, and telemetry helpers.
  • Added Python and Bash scanning rules for secret reads, credential file access, dangerous delete operations, non-whitelisted network access, secret exfiltration, process execution, dependency installation, privilege escalation, dynamic code execution, shell features requiring review, and resource exhaustion patterns.
  • Added YAML policy support with examples/tool_safety/tool_safety_policy.yaml, strict policy validation for CI/review usage, and compatibility-preserving non-strict mode that warns and ignores invalid fields.
  • Added shared input extraction for Tool/Skill/MCP-like payloads, including top-level script/code/command fields, python_code / bash_code, code_blocks, args / argv / command_args, and nested payloads such as tool_input, params.arguments, and input.
  • Added opt-in core integrations for BashTool(enable_safety_guard=True, ...) and UnsafeLocalCodeExecutor(enable_safety_guard=True, ...).
  • Added CLI tooling: scripts/tool_safety_check.py and scripts/tool_safety_manifest_report.py.
  • Added manifest-driven examples and deterministic report artifact: examples/tool_safety/samples/manifest.yaml and examples/tool_safety/all_reports.json.
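
The shared input extraction described above can be pictured with a small sketch. The field names are taken from this description; the function name `extract_scripts`, the traversal order, and the assumed `{"code": ...}` shape for dict entries in `code_blocks` are hypothetical and are not the code in `_extractors.py`.

```python
from typing import Any

# Field names come from the PR description; the traversal logic is an assumption.
SCRIPT_KEYS = ("script", "code", "command", "python_code", "bash_code")
ARG_KEYS = ("args", "argv", "command_args")
NESTED_PATHS = (("tool_input",), ("params", "arguments"), ("input",))


def extract_scripts(payload: dict[str, Any]) -> list[str]:
    """Collect candidate script strings from a tool-call payload (illustrative only)."""
    found: list[str] = []
    # Top-level string fields that may carry a script.
    for key in SCRIPT_KEYS:
        value = payload.get(key)
        if isinstance(value, str) and value.strip():
            found.append(value)
    # code_blocks: accept plain strings or dicts with a "code" key (assumed shape).
    blocks = payload.get("code_blocks")
    if isinstance(blocks, list):
        for block in blocks:
            if isinstance(block, str):
                found.append(block)
            elif isinstance(block, dict) and isinstance(block.get("code"), str):
                found.append(block["code"])
    # Argument vectors are joined into one command line for scanning.
    for key in ARG_KEYS:
        value = payload.get(key)
        if isinstance(value, list) and value:
            found.append(" ".join(str(item) for item in value))
    # Recurse into the nested containers named in the description.
    for path in NESTED_PATHS:
        node: Any = payload
        for part in path:
            node = node.get(part) if isinstance(node, dict) else None
        if isinstance(node, dict):
            found.extend(extract_scripts(node))
    return found
```

Routing the Filter and the Wrapper through one extractor like this is what would keep the two entry points from scanning different text for the same payload.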

Issue #90 Acceptance Mapping

  • Python and Bash scanning: trpc_agent_sdk/tools/safety/_scanner.py, trpc_agent_sdk/tools/safety/_rules.py
  • Configurable policy YAML: trpc_agent_sdk/tools/safety/_policy.py, examples/tool_safety/tool_safety_policy.yaml
  • Three-way decision model: allow, deny, needs_human_review
  • Filter / wrapper pre-execution interception: trpc_agent_sdk/tools/safety/_filter.py, trpc_agent_sdk/tools/safety/_wrapper.py
  • Tool / Skill-like / MCP-like payload extraction: trpc_agent_sdk/tools/safety/_extractors.py, examples/tool_safety/skill_wrapper_example.py
  • Core opt-in integrations: trpc_agent_sdk/tools/file_tools/_bash_tool.py, trpc_agent_sdk/code_executors/local/_unsafe_local_code_executor.py
  • Structured report fields: decision, risk_level, rule_id, risk_type, evidence, recommendation
  • Audit and telemetry: trpc_agent_sdk/tools/safety/_audit.py, trpc_agent_sdk/tools/safety/_telemetry.py
  • Documentation: examples/tool_safety/README.md, examples/tool_safety/PR_DESCRIPTION.md
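
As a rough illustration of how the three-way decision and the structured report fields listed above might fit together, here is a minimal sketch. The field names come from this description; the class names, the severity scale, the `deny_at` threshold, and the aggregation rule are assumptions, not the SDK's actual model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(str, Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_HUMAN_REVIEW = "needs_human_review"


# Assumed severity scale; the PR names a risk_level field but not its values.
RISK_ORDER = ("low", "medium", "high", "critical")


@dataclass
class Finding:
    rule_id: str
    risk_type: str
    risk_level: str
    evidence: str
    recommendation: str


@dataclass
class SafetyReport:
    decision: Decision
    risk_level: str
    findings: list[Finding] = field(default_factory=list)


def build_report(findings: list[Finding], deny_at: str = "high") -> SafetyReport:
    """Aggregate findings into one decision (hypothetical aggregation rule)."""
    if not findings:
        return SafetyReport(Decision.ALLOW, "none")
    # The report takes the severity of its worst finding.
    worst = max(findings, key=lambda f: RISK_ORDER.index(f.risk_level))
    if RISK_ORDER.index(worst.risk_level) >= RISK_ORDER.index(deny_at):
        decision = Decision.DENY
    else:
        decision = Decision.NEEDS_HUMAN_REVIEW
    return SafetyReport(decision, worst.risk_level, list(findings))
```

In this sketch any finding below the threshold yields needs_human_review rather than allow; a real policy could equally map individual rules to decisions.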

Sample Validation

The sample corpus is manifest-driven.

Current manifest status:

  • 52 samples
  • 52 / 52 decision matches
  • 52 / 52 required-rule matches
  • secret-read samples: none resolve to allow
  • dangerous-delete samples: none resolve to allow
  • non-whitelisted-network samples: none resolve to allow
  • safe samples: none resolve to deny
  • 500-line Python and Bash script scanning covered by performance tests
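
The two match counts above can be read as the output of a comparison like the one sketched here. `ManifestEntry`, `ScanResult`, and `check_manifest` are hypothetical names, and the real manifest schema in `manifest.yaml` may differ.

```python
from dataclasses import dataclass


@dataclass
class ManifestEntry:
    file: str
    expected_decision: str
    required_rules: frozenset[str] = frozenset()


@dataclass
class ScanResult:
    decision: str
    rule_ids: frozenset[str]


def check_manifest(
    entries: list[ManifestEntry], results: dict[str, ScanResult]
) -> tuple[int, int, list[str]]:
    """Return (decision matches, required-rule matches, failing files)."""
    decision_matches = 0
    rule_matches = 0
    failures: list[str] = []
    for entry in entries:
        result = results[entry.file]
        decision_ok = result.decision == entry.expected_decision
        # Every required rule must fire; additional rules are tolerated.
        rules_ok = entry.required_rules <= result.rule_ids
        decision_matches += decision_ok
        rule_matches += rules_ok
        if not (decision_ok and rules_ok):
            failures.append(entry.file)
    return decision_matches, rule_matches, failures
```

Counting the two kinds of match separately shows whether a sample reached the expected decision for the expected reason.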

examples/tool_safety/all_reports.json is generated by:

python scripts/tool_safety_manifest_report.py --strict-policy

The committed artifact normalizes dynamic fields for deterministic review:

scan_id = manifest:<file>
timestamp = 1970-01-01T00:00:00+00:00
elapsed duration fields = 0.0
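
A normalization step of this kind could look like the sketch below. The three target values come from the list above; the function name and the heuristic that treats any numeric key containing "elapsed" or "duration" as a timing field are assumptions.

```python
from datetime import datetime, timezone
from typing import Any

# Unix epoch in ISO 8601 form: "1970-01-01T00:00:00+00:00".
EPOCH = datetime.fromtimestamp(0, tz=timezone.utc).isoformat()


def normalize_report(report: dict[str, Any], file: str) -> dict[str, Any]:
    """Return a copy of a report with run-dependent fields pinned (illustrative)."""

    def scrub(node: Any) -> Any:
        if isinstance(node, dict):
            out: dict[str, Any] = {}
            for key, value in node.items():
                if key == "timestamp":
                    out[key] = EPOCH
                elif ("elapsed" in key or "duration" in key) and isinstance(
                    value, (int, float)
                ):
                    # Assumed naming convention for timing fields.
                    out[key] = 0.0
                else:
                    out[key] = scrub(value)
            return out
        if isinstance(node, list):
            return [scrub(item) for item in node]
        return node

    normalized = scrub(report)
    normalized["scan_id"] = f"manifest:{file}"
    return normalized
```

With these fields pinned, regenerating the artifact on an unchanged corpus should produce a byte-identical file, so any diff in review reflects a real change in scanner behavior.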

Compatibility

All runtime integrations are opt-in.

  • BashTool does not enable the safety guard by default.
  • UnsafeLocalCodeExecutor does not enable the safety guard by default.
  • Filter/Wrapper usage must be explicitly configured.
  • needs_human_review does not block execution unless block_on_review=true is set.

This preserves existing behavior for current users.
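
The opt-in semantics can be summarized in a few lines. This is a sketch of the behavior described above, not the integration code; `should_block` is a hypothetical helper, and treating deny as always blocking once the guard is enabled is an assumption.

```python
def should_block(
    decision: str,
    *,
    enable_safety_guard: bool = False,
    block_on_review: bool = False,
) -> bool:
    """Decide whether a scan result should stop execution (illustrative)."""
    if not enable_safety_guard:
        # Guard is off by default: existing behavior is unchanged.
        return False
    if decision == "deny":
        return True
    if decision == "needs_human_review":
        # Review results only block when the policy asks for it.
        return block_on_review
    return False
```

Both flags default to False, so a caller who passes neither gets the pre-PR behavior.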

Validation

I ran the safety-focused test and validation suite:

python -m pytest tests/tools/safety/test_policy_validation.py tests/tools/safety/test_extractors.py tests/tools/safety/test_filter.py tests/tools/safety/test_wrapper_extraction_consistency.py -q
python -m pytest tests/tools/safety/test_manifest_report_cli.py tests/tools/safety/test_cli.py -q
python -m pytest tests/tools/safety/test_custom_rules.py tests/tools/safety/test_redaction_privacy.py tests/tools/safety/test_performance.py -q
python -m pytest tests/tools/safety -q
python scripts/tool_safety_manifest_report.py --strict-policy
python scripts/tool_safety_check.py --file examples/tool_safety/samples/dangerous_delete.sh --language bash --policy examples/tool_safety/tool_safety_policy.yaml --strict-policy
python scripts/tool_safety_check.py --file examples/tool_safety/samples/safe_python.py --language python --policy examples/tool_safety/tool_safety_policy.yaml --strict-policy

Expected CLI results:

  • dangerous_delete.sh: deny, exit code 3
  • safe_python.py: allow, exit code 0
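
For CI use, the documented flags and exit codes could be wrapped as in the sketch below. Only the two exit codes listed above are mapped; any other code is reported as "unknown" because this description does not define it. The helper names are hypothetical.

```python
import subprocess
import sys

# Only these two codes are documented in this PR description.
EXIT_CODE_DECISIONS = {0: "allow", 3: "deny"}


def build_check_command(
    file: str, language: str, policy: str, strict_policy: bool = True
) -> list[str]:
    """Build the argv for scripts/tool_safety_check.py using the documented flags."""
    cmd = [
        sys.executable,
        "scripts/tool_safety_check.py",
        "--file", file,
        "--language", language,
        "--policy", policy,
    ]
    if strict_policy:
        cmd.append("--strict-policy")
    return cmd


def interpret_exit_code(code: int) -> str:
    """Map a CLI exit code to a decision, or "unknown" if undocumented."""
    return EXIT_CODE_DECISIONS.get(code, "unknown")


def run_check(file: str, language: str, policy: str) -> str:
    """Run the check from the repository root and return the decision."""
    completed = subprocess.run(build_check_command(file, language, policy), check=False)
    return interpret_exit_code(completed.returncode)
```

`check=False` is deliberate: a non-zero exit here is a scan result to interpret, not a failure to raise on.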

I also ran the upstream lint-equivalent checks for the modified safety paths:

python -m yapf --diff <changed trpc_agent_sdk/*.py files>
python -m flake8 trpc_agent_sdk/tools/safety scripts/tool_safety_check.py scripts/tool_safety_manifest_report.py tests/tools/safety

yapf --diff had no output after formatting, and flake8 passed.

I attempted the full repository coverage command:

pytest --cov=trpc_agent_sdk --cov-report=term --cov-fail-under=80 tests/

In my local environment it failed during test collection, before reaching any of the safety changes, because of pre-existing issues in unrelated paths:

  • Claude agent tests hitting a Python 3.11 / Pydantic typing.TypedDict compatibility issue.
  • OpenClaw-related tests missing nanobot.heartbeat.
  • tests/test_cli.py module discovery hitting the same Claude/Pydantic collection error.

The safety-specific tests, manifest report, CLI checks, YAPF, and flake8 validations all passed.

Limitations

This is a static pre-execution guard, not a sandbox.

It can reduce accidental and obviously risky tool execution, but it cannot guarantee protection against obfuscation, runtime-generated code, encoded payloads, interpreter-specific behavior, external binaries, or environment-dependent behavior.

Production deployments should combine this guard with sandboxing, permission isolation, network egress controls, resource limits, runtime audit logs, and monitoring.
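
To make the encoded-payload limitation concrete, the sketch below uses a deliberately naive single-pattern check, which is not one of this PR's rules, and shows the same delete command passing once it is base64-encoded. The guard's own rules may well flag the pipe into a shell; the point is only that the dangerous text itself is invisible to a pattern match.

```python
import base64
import re

# Toy rule for illustration only; the real rule set is richer than this.
DANGEROUS_DELETE = re.compile(r"\brm\s+-rf\s+/")


def naive_scan(script: str) -> bool:
    """Return True when the toy delete pattern appears literally in the script."""
    return bool(DANGEROUS_DELETE.search(script))


plain = "rm -rf /tmp/build"
encoded = base64.b64encode(plain.encode()).decode()
# Same effect at runtime, but the delete never appears as literal text.
obfuscated = f"echo {encoded} | base64 -d | sh"

plain_flagged = naive_scan(plain)
obfuscated_flagged = naive_scan(obfuscated)
```

Here `plain_flagged` is True and `obfuscated_flagged` is False, which is why runtime isolation remains necessary alongside the static scan.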




YAO-001 commented Jul 4, 2026

Author

I have read the CLA Document and I hereby sign the CLA


codecov Bot commented Jul 4, 2026


Codecov Report

❌ Patch coverage is 87.87879% with 180 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@73655ab). Learn more about missing BASE report.

Files with missing lines                                  Patch %      Lines
trpc_agent_sdk/tools/safety/_rules.py                     84.91620%    108 Missing ⚠️
trpc_agent_sdk/tools/safety/_scanner.py                   86.93182%    23 Missing ⚠️
trpc_agent_sdk/tools/safety/_extractors.py                92.00000%    8 Missing ⚠️
trpc_agent_sdk/tools/safety/_policy.py                    93.60000%    8 Missing ⚠️
trpc_agent_sdk/tools/safety/_filter.py                    89.70588%    7 Missing ⚠️
trpc_agent_sdk/tools/safety/_wrapper.py                   90.16393%    6 Missing ⚠️
trpc_agent_sdk/tools/file_tools/_bash_tool.py             84.37500%    5 Missing ⚠️
trpc_agent_sdk/tools/safety/_custom_rules.py              90.56604%    5 Missing ⚠️
...ode_executors/local/_unsafe_local_code_executor.py     85.71429%    4 Missing ⚠️
trpc_agent_sdk/tools/safety/_telemetry.py                 71.42857%    4 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main        #121   +/-   ##
==========================================
  Coverage        ?   87.52665%           
==========================================
  Files           ?         478           
  Lines           ?       45489           
  Branches        ?           0           
==========================================
  Hits            ?       39815           
  Misses          ?        5674           
  Partials        ?           0           
